Partially Observable Multi-Sensor Sequential Change Detection: A Combinatorial Multi-Armed Bandit Approach

Related resources

Combinatorial Multi-Objective Multi-Armed Bandit Problem

In this paper, we introduce the COmbinatorial Multi-Objective Multi-Armed Bandit (COMOMAB) problem that captures the challenges of combinatorial and multi-objective online learning simultaneously. In this setting, the goal of the learner is to choose an action at each time, whose reward vector is a linear combination of the reward vectors of the arms in the action, to learn the set of super Par...

Material for “Combinatorial multi-armed bandit”

We use the following two well known bounds in our proofs. Lemma 1 (Chernoff-Hoeffding bound). Let $X_1, \dots, X_n$ be random variables with common support $[0, 1]$ and $E[X_i] = \mu$. Let $S_n = X_1 + \dots + X_n$. Then for all $t \ge 0$, $\Pr[S_n \ge n\mu + t] \le e^{-2t^2/n}$ and $\Pr[S_n \le n\mu - t] \le e^{-2t^2/n}$. Lemma 2 (Bernstein inequality). Let $X_1, \dots, X_n$ be independent zero-mean random variables. If for all $1 \le i \le n$, $|X_i| \le k$...
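The Chernoff-Hoeffding bound quoted in this excerpt can be checked numerically. The sketch below is not taken from the cited material: it assumes, for illustration only, that the $X_i$ are independent Bernoulli(p) variables, so that $S_n$ is binomial and both tails can be computed exactly. The function names (`hoeffding_bound`, `binomial_upper_tail`, `binomial_lower_tail`, `check_lemma1`) are hypothetical.

```python
import math


def hoeffding_bound(n: int, t: float) -> float:
    """Right-hand side of the bound: exp(-2 t^2 / n)."""
    return math.exp(-2.0 * t * t / n)


def binomial_upper_tail(n: int, p: float, threshold: float) -> float:
    """Exact Pr[S_n >= threshold] for S_n ~ Binomial(n, p)."""
    k_min = max(0, math.ceil(threshold))
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1)
    )


def binomial_lower_tail(n: int, p: float, threshold: float) -> float:
    """Exact Pr[S_n <= threshold] for S_n ~ Binomial(n, p)."""
    k_max = min(n, math.floor(threshold))
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, k_max + 1)
    )


def check_lemma1(n: int, p: float, ts):
    """Return (t, upper tail, lower tail, bound) for each deviation t.

    Assumption (ours): X_i are i.i.d. Bernoulli(p), so mu = p.
    """
    rows = []
    for t in ts:
        upper = binomial_upper_tail(n, p, n * p + t)
        lower = binomial_lower_tail(n, p, n * p - t)
        rows.append((t, upper, lower, hoeffding_bound(n, t)))
    return rows


if __name__ == "__main__":
    for t, upper, lower, bound in check_lemma1(n=100, p=0.5, ts=[0, 5, 10, 15]):
        print(f"t={t:>2}  upper={upper:.3e}  lower={lower:.3e}  bound={bound:.3e}")
```

In this Bernoulli setting both exact tails should sit at or below the bound for every $t \ge 0$, and the gap illustrates how loose the inequality can be for a specific distribution.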

Multi–Armed Bandit for Pricing

This paper is about the study of Multi–Armed Bandit (MAB) approaches for pricing applications, where a seller needs to identify the selling price for a particular kind of item that maximizes her/his profit without knowing the buyer demand. We propose modifications to the popular Upper Confidence Bound (UCB) bandit algorithm exploiting two peculiarities of pricing applications: 1) as the selling...

Online Multi-Armed Bandit

We introduce a novel variant of the multi-armed bandit problem, in which bandits are streamed one at a time to the player, and at each point, the player can either choose to pull the current bandit or move on to the next bandit. Once a player has moved on from a bandit, they may never visit it again, which is a crucial difference between our problem and classic multi-armed bandit problems. In t...

Combinatorial Multi-Armed Bandit with General Reward Functions

In this paper, we study the stochastic combinatorial multi-armed bandit (CMAB) framework that allows a general nonlinear reward function, whose expected value may not depend only on the means of the input random variables but possibly on the entire distributions of these variables. Our framework enables a much larger class of reward functions such as the max() function and nonlinear utility fun...

Journal

Journal title: Proceedings of the AAAI Conference on Artificial Intelligence

Year: 2019

ISSN: 2374-3468,2159-5399

DOI: 10.1609/aaai.v33i01.33015733